Do scientists follow hot topics in their scientific investigations? In this paper, by analyzing papers
published in the American Physical Society (APS) Physical Review journals, we find that new papers are more
likely to be attracted to hot fields, where the hotness of a field is measured by the number of papers
belonging to it. This indicates that scientists generally do follow hot topics. However, there are
qualitative differences among scientists from various countries, and among research works with different
numbers of authors, affiliations and references. These observations could be valuable for policy makers
when deciding research funding and also for individual researchers when searching for scientific projects.

The phenomenon of "the rich get richer", also called preferential attachment in the field of complex
networks [1], is quite common in many fields [2,3] (for example, see the references cited in Table I). However,
the scientific community is composed of scientists, a special group of people who focus on proposing,
investigating and implementing original and creative ideas. It is therefore plausible that the "rich get richer"
phenomenon is less pronounced in the fields investigated by scientists than in other areas. Ideally, scientists
choose their fields of investigation according to their scientific interest and the scientific value of the
questions investigated, not according to the hotness of the fields. In this work, using papers published in the
American Physical Society (APS) Physical Review journals from 1976 to 2009, we test whether the subject of a new
paper is more likely to be in a hot field than in a relatively unknown field at the time the paper is published.
We also compare scientists from different countries, which could provide useful information. In China, modern
scientific development is still very young, and many scientists believe that, compared with many other countries,
China has more followers relative to original thinkers. In this work, we offer direct empirical support for this
hypothesis. Finally, we determine whether the degree of tracing hot fields differs for papers with different
numbers of authors, affiliations and references. Interestingly, we find that scientists who collaborate with more
authors or more affiliations tend to follow hotter fields than those who work with few collaborators or
affiliations. Moreover, papers with a small number of references are, on average, more interest-driven or
value-driven, whereas papers with a large number of references are more hotness-driven. These empirical findings,
particularly if the analysis is repeated in other fields and over longer periods of time, could provide valuable
information for policy makers.

Results 

Empirically, the phenomenon of preferential attachment has been found in many systems. We compile a list of
typical systems in Table I; their positive exponents α indicate that new links are more likely to attach to
nodes with larger degree. Later, we compare our results on the hotness tracing of newly published papers
against the other phenomena listed in this table.

Firstly, we examine the preferential attachment of papers in the PR-PACS data set (see Methods for details).
In a log-log plot, Fig. 1 (a) displays the cumulative probability function κ(k) with respect to the size k of
the field that a newly published paper belongs to. The positive exponents α indicate that new papers are more
likely to focus on hot fields (larger sizes), i.e., new links preferentially attach to nodes (PACS) with larger
sizes. In other words, scientists generally do publish more new papers in currently hot fields. We obtain the
exponents α by least-squares fitting from k = 1 to k = 300, as the curves deviate from a straight line for
large k due to low statistics. For different years t, all κ(k) follow a power law, namely
$\kappa(k) \sim k^{\alpha+1}$, but with slightly different exponents α (as shown in Fig. 1 (b)). We also plot
the distribution of field sizes N(k) as an inset in Fig. 1 (a); it is highly skewed.
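
The fitting step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' code: the function name `fit_exponent`, its default bounds and the synthetic input curve are all hypothetical, and an ordinary least-squares line in log-log coordinates is assumed.

```python
import math

def fit_exponent(ks, kappas, k_min=1, k_max=300):
    """Fit log(kappa) against log(k) on [k_min, k_max] by ordinary least squares.

    Assumes kappa(k) ~ k**(alpha + 1), so alpha is the fitted slope minus one.
    """
    pts = [(math.log(k), math.log(v))
           for k, v in zip(ks, kappas)
           if k_min <= k <= k_max and v > 0]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx
    return slope - 1.0

# Synthetic, noiseless curve with alpha = 0.7 (illustrative data, not from the paper).
ks = list(range(1, 1001))
kappas = [0.01 * k ** 1.7 for k in ks]
alpha_hat = fit_exponent(ks, kappas)
```

Restricting the fit to a bounded range of k mirrors the remark that the curves leave the straight line at large k, where few fields contribute.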
Table I | A list of values of α, the degree of preferential attachment, in different networks collected from the literature, together with the
number of nodes N (and M for bipartite networks) and the number of links E. The first column gives the name of the investigated data set, with
the relationship studied in parentheses. Note that APS-PACS, Google Earth and Digg are bipartite networks between individuals and groups,
while all other networks describe relations among individuals.

Network                               N (M)                    E             α              Ref.
APS-PACS (papers belong to fields)    N: 315,082 (M: 5,472)    900,832       0.52-0.87      -
Digg (membership of group)            N: 212,635 (M: 50)       1,185,167     0.69           5
Linkedin (friendship)                 7,550,955                30,682,028    0.6            6
Sexual networks (sexual contact)      260-1220                 -             0.32-0.80      4
Google Earth (membership of group)    N: 231,976 (M: 54)       345,038       1.11           5
Medline (coauthorship)                1,648,660                -             1.04 ± 0.04    11
NYGI (coproduction)                   10,000                   700,000       1.20 ± 0.06    15
Bar (communication)                   3,988                    -             1.25 ± 0.13    16
Google (communication)                39,918                   -             1.36 ± 0.14    16
Flickr (following)                    584,207                  3,554,130     1.0            6
aNobii (friendship, following)        86,800                   697,910       1.0            17
Douban (following)                    1,614,288                14,573,170    0.95           18
Wealink (friendship)                  223,482                  273,209       1.0            19
Citation (citation)                   1,736                    83,252        0.95 ± 0.1     12



Compared with the preferential attachment phenomena in the other
fields listed in Table I, the exponents α = 0.52-0.87 from our
PR-PACS data are near the lower end of all the exponents. Out of the
other 13 entries in Table I, only sexual contact in sexual networks
(α = 0.32-0.80) [4], membership of groups in Digg (α = 0.69) [5] and
friendship relations in Linkedin (α = 0.6) [6] are at approximately
the same level of hotness tracing. It might be easy to "follow" a
star member on a social networking website such as Flickr
(α = 1.0) [6]; however, it might not be so easy to shift research
field, join a user group of a different product or change sexual
partners towards hotter choices. Besides the difficulty of changing
fields, the lower exponent might also arise because scientific work
is more independent and more interest-driven or value-driven than
activity in other fields.

Secondly, we test whether the intensity of tracking hot topics in
scientific research differs between countries. We classify the papers
according to the country of the first author's affiliation, and
calculate the absolute contribution ratios r_c of several major
countries within the PR-PACS data set: USA (33.07%), Germany
(9.95%), Japan (6.94%) and China (3.73%). As seen from Fig. 2 (a), in
the year 2008 the exponent of P. R. China (α = 1.06) is larger than
that of the other countries, e.g., USA 0.74, Germany 0.86 and Japan
0.93. Moreover, as shown in the inset of Fig. 2 (a), the exponent α
for China is generally larger than that of the other countries in
different years. These results indicate that the phenomenon of
tracing hotness is more pronounced among Chinese scholars. In
addition, we find that the UK (3.80%) and France (5.55%) lie between
the USA and Germany, while Russia (1.89%) lies between Germany and
China.

To provide a comparative study, we also calculate the relative
contribution ratio R_c(k) of papers from country c that belong to
fields of size k (see Methods for details). As seen from Fig. 2
(b), the relative contribution ratio R_c(k) of Chinese scholars is
smaller than 1 in cold fields (smaller k) but larger than 1 in hot
fields (larger k), indicating that Chinese scholars contribute less
than their average share to cold fields but more than their average
share to hot fields. The situation for the USA is the opposite. From
another angle, this difference also indicates that Chinese scientists
are keener to follow hot topics than United States scientists, in
agreement with our previous observations.

Considering that scientific research in China is still young, it is
understandable that a large percentage of its output is in
hotness-driven fields rather than value-driven fields. The discovered
order of the degree of hotness tracing (the USA, Germany, Japan and
China) is more or less consistent with our intuition. How these
different positions relate to the scientific policies of each
country, or even to its culture and values, is definitely worth
further investigation but is outside the scope of the current study.
We simply want to demonstrate the capability of the methods discussed
above in analyzing publication records, and to present some basic
discoveries made using these methods.

Figure 1 | Empirical preferential attachment to hot fields of new papers in the PR-PACS data set. (a) The cumulative probability functions κ(k)
in the years 2000 and 2008. Inset: the distribution function of the sizes of fields. (b) The exponents α for different years. We start our
measurements from 1995, not 1976, the beginning year of the data, since the method requires a relatively large initial system. Notice that
α = 0.52-0.87 is near the lower end of all the exponents collected in Table I. The 2009 exponent is smaller than those of the preceding several
years, for reasons that we do not yet know. Unless otherwise mentioned, Δt is one year in this paper. The straight lines are guides to the eye
throughout this paper.

Figure 2 | Results for authors from the countries contributing the most papers to the PR-PACS data set. (a) The cumulative probability functions
κ(k) in the year 2008 for USA, Germany, Japan and P. R. China. The exponent of Chinese authors (α = 1.06) is much larger than that of the other
countries. Inset: the exponents α for different years, where the exponent α of China is always larger than that of the USA. (b) The relative
contribution ratio R_c(k) for USA, Germany, Japan and P. R. China.

Next, we measure the influence of the number of authors and
affiliations on the degree of tracing hot topics. We classify the
papers according to their number of authors and number of
affiliations. It is argued in Ref. 7 that research works with many
authors or many affiliations typically focus more on hot topics,
because it might be exactly the hotness of the subject that made
collaboration attractive to the scientists, and because a joint task
team is generally more likely to focus on short-term projects than on
long-term projects. Here we examine this argument using the PR-PACS
data. We can see from Fig. 3 that, overall, α, the degree of severity
of tracing hot topics, increases with the number of authors and
affiliations. These results provide empirical support for the
arguments in Ref. 7. In a sense, global collaboration is not
necessarily a good strategy for high-quality, value-driven research
topics, as suggested in Ref. 7.

Moreover, it has been claimed that the size of teams and the share of
multi-university collaborations have grown rapidly in recent decades
[8-10], and that teams and multi-university collaborations tend to
produce more highly cited papers [9,10]. It is generally believed that
papers belonging to hot fields acquire citations more easily than
those in cold fields, as there are more papers focusing on similar
topics. Therefore, the facts discovered in Refs. 8,10 do not imply
that multi-university collaborations and larger teams are making
bigger and more significant progress; they might instead reflect the
fact that the fields they work in are bigger. We check this
hypothesis here by testing whether papers with more authors and more
affiliations are more likely to attach to hot fields. This is indeed
the case, as shown in Fig. 3. From this viewpoint, our results are
consistent with the claims in Refs. 9,10.

Figure 3 | The preferential attachment exponents α with respect to the
number of authors N_Aut (the number of affiliations N_Aff in the inset) in
the years 2000 and 2008. Note that every three consecutive numbers of
authors are grouped together and labelled by the middle number, e.g., 1, 2
and 3 are grouped together and labelled as 2. Overall, the exponents α
increase with the number of authors and affiliations.

Finally, we investigate the effect of the number of references on
tracking hot topics. We classify the papers according to their number
of references. The average number of references per paper today is
much larger than in earlier times. For earlier times, one can
intuitively hypothesize that a pioneering paper or a paper of good
quality typically cited fewer references. Today, however, the number
of references may or may not relate to how innovative a paper is.
Here, we examine this hypothesis. As seen from Fig. 4, the exponent
α, the degree of severity of tracing hot topics, increases with the
number of references, which could indicate that papers with a larger
number of references are more likely to be on hot topics. Notice that
the values of α for larger numbers of references are larger in recent
years than in earlier years, suggesting that scientists trace hot
fields somewhat more strongly in recent years than in earlier times.
Therefore, the hypothesis is reasonable overall.
Figure 4 | The preferential attachment exponents α with respect to the
number of references N_Ref. The exponents α increase with the number of
references. For larger numbers of references, the value of α is somewhat
larger in recent years than in early years.
Discussion

We have found that, in the fields of physics covered by papers
published in the APS Physical Review journals, hot fields attract
more newly published papers; however, scientific work is less
hotness-driven than activity in the other fields compared. Among the
major countries, this phenomenon of tracing hotness is more
pronounced in China than in other countries, which could be related
to the fact that China is still a developing country in terms of
scientific research. We also found that papers with more authors,
more affiliations and more references were on subjects that were more
hotness-driven. These observations could potentially be valuable to
policy makers who fund scientific projects.

Here only data on physics publications were analyzed. A cross-field
comparison would be interesting, particularly if similar studies can
be carried out for mathematics and for social sciences such as
economics. Similar studies could also be applied to producing,
selling or buying products, that is, to investigate how the
likelihood that a product is manufactured, sold or bought depends on
the number of units of that product that have already been
manufactured, sold or bought. Such studies could be valuable for
marketing research.

It might also be interesting to determine how often outstanding or
important works are in hot fields when they are first published.
Fields gradually or suddenly become hot after major breakthroughs are
presented in a few pioneering papers, which might later be awarded or
honoured. Using records of awards such as the Nobel Prize, the Fields
Medal, the Turing Award, etc., it would be possible to perform a
statistical analysis of papers with major breakthroughs. All the
award-winning papers could be collected and the same analysis
performed to determine α, which could then be compared with the
overall α.

The above investigation can be performed not only at the macro level
of the countries of authorship but also at the mesoscopic level of
the affiliations of authorship. In this way, one might be able to
compare tradition, culture and research quality at various levels of
academic units. In principle, one could also collect all the
published papers of one author, treat them as a subset and study the
preferential attachment phenomenon of those papers, provided the
collection is large enough. It might then be possible to use the
result as a measure of the creativity and originality of a researcher.

Methods 

Data description and notations. Our data set is a collection of all papers published by
the American Physical Society (APS) Physical Review journals from 1976 to 2009.
Each paper is recorded as a data entry, which includes its title, date of publication,
classification numbers according to the AIP Physics and Astronomy Classification
Scheme (PACS), author(s), affiliation(s) and reference(s) to other papers within the
data set. The entire data set contains more than 320,000 entries, including a variety of
article types, such as article, brief report, rapid communication, comment, reply,
erratum, essay, announcement, editorial and so on. Here, we consider only research
papers, e.g., article, brief report and rapid communication, with PACS numbers, and
refer to this subset as the PR-PACS data set. In the end, we have N = 315,082 entries,
where each entry, i.e. each paper, is denoted by i, and M = 5,472 PACS numbers,
each denoted by λ.

We use PACS, the established hierarchical classification system of physics, to
identify the fields within physics. The investigated APS-PACS system is a
bipartite network with two kinds of nodes, i.e., papers and PACS, and one
kind of link, i.e., the relation between a paper and a PACS, which is determined by the
researchers. Mathematically, we can use one matrix to characterize the relation
between papers and PACS: A is an N × M adjacency matrix with element $a_{i\lambda} = 1$ if
paper i belongs to PACS λ (in other words, PACS λ appears in paper i), and 0 otherwise.
Generally, one paper can belong to several fields (have several PACS), so there
are a total of 900,832 links between papers and PACS, i.e., $E = \sum_{i,\lambda} a_{i\lambda}$.
The size of a field, $k_\lambda = \sum_i a_{i\lambda}$, is defined as the number of papers that
belong to it, and describes the popularity (hotness) of the field. We can then calculate
the number N(k) of fields with size k. Based on the PR-PACS data, our main concern is
whether new papers tend to attach to hot fields (with larger size k), i.e., whether new
links preferentially attach to nodes with larger degree.
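
To make the bookkeeping concrete, the quantities defined above (the number of links E, the size of each field and the size distribution N(k)) can be computed from a list of paper records. The sketch below uses a tiny made-up data set; the paper identifiers, the shortened PACS-like codes and the variable names are hypothetical and serve only to illustrate the definitions.

```python
from collections import Counter

# Hypothetical toy records: (paper id, PACS-like codes attached to the paper).
papers = [
    ("p1", ["05.45", "89.75"]),
    ("p2", ["89.75"]),
    ("p3", ["89.75", "02.50"]),
    ("p4", ["05.45"]),
]

# Each distinct (paper, field) pair is one link, i.e. one nonzero matrix element.
links = [(pid, code) for pid, codes in papers for code in set(codes)]
E = len(links)

# Field size: the number of papers that carry the field's code.
field_size = Counter(code for _, code in links)

# N(k): the number of fields whose size is exactly k.
size_distribution = Counter(field_size.values())
```

Storing only the links rather than the full N × M matrix is a natural choice here, since each paper carries only a handful of PACS codes and the matrix is very sparse.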

Measuring preferential attachment. Here, we explain our method for testing
preferential attachment on a temporal data set. The basic idea is to investigate
whether new links are more likely to attach to nodes with larger degree (size).

We calculate the empirical value of the relative probability T(k) that a new paper
published within a short period Δt connects to a field which has size k before
time t [11] as follows. The corresponding time-dependent absolute probability P_k(t)
that a new paper is published in a field with size k is proportional to T(k) n_k(t)/N(t),
where n_k(t) is the number of fields with size k and N(t) is the number of fields
immediately before time t. T(k) can therefore be estimated by making a histogram of the
sizes k of the fields to which each paper is added within the time period Δt, in which
each sample is weighted by a factor of N(t)/n_k(t):

$$T(k) = \sum_{k_\lambda(t) = k} a_{i\lambda}\, \frac{N(t)}{n_k(t)}, \tag{1}$$

where $k_\lambda(t) = k$ means that the field λ, to which the papers published within the
period Δt belong, has size k at time t. We now have the empirical curve T(k) from the above
statistical analysis. In order to conveniently compare T(k) at different times t, T(k) is
normalized as $T'(k) = T(k) / \sum_k T(k)$ [12,13].
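
The weighted histogram described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `estimate_T`, the input layout and the toy numbers are assumptions, and links to fields that did not exist before time t are simply skipped, which is also an assumption.

```python
from collections import Counter

def estimate_T(size_before, new_links):
    """Estimate T(k) and its normalized form T'(k) for one time window.

    size_before: dict mapping field -> size k at time t.
    new_links:   iterable of fields, one entry per new paper-field link
                 created during the window (t, t + dt).
    """
    n_fields = len(size_before)              # N(t): number of fields before t
    n_k = Counter(size_before.values())      # n_k(t): number of fields of size k
    T = Counter()
    for field in new_links:
        k = size_before.get(field)
        if k is None:                        # field absent before t: skipped (assumption)
            continue
        T[k] += n_fields / n_k[k]            # each sample weighted by N(t)/n_k(t)
    total = sum(T.values())
    T_norm = {k: v / total for k, v in T.items()} if total else {}
    return dict(T), T_norm

# Toy example: two fields of size 1, one field of size 4.
size_before = {"A": 1, "B": 1, "C": 4}
new_links = ["A", "C", "C", "C", "D"]        # "D" is new and therefore ignored
T, T_norm = estimate_T(size_before, new_links)
```

The weight compensates for the fact that there are usually many more small fields than large ones, so that T(k) reflects the attractiveness of a single field of size k rather than of all such fields together.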

The preferential attachment hypothesis states that the rate T(k) with which a node
with k links acquires new links is a monotonically increasing function of k, namely

$$T(k) = C(t)\,k^{\alpha}. \tag{2}$$

For the BA model, α = 1 [1]. To obtain a smooth curve from noisy data, we take the
cumulative function form instead of T'(k):

$$\kappa(k) = \int_0^{k} T'(k')\,dk'. \tag{3}$$

Thus, κ(k) should be proportional to $k^{\alpha+1}$. We can now fit the empirical curve from
the previous statistical analysis and compare it against this hypothesized curve of
preferential attachment. This is the general procedure of all the analyses presented in
this work. This method has been used extensively to measure the preferential
attachment phenomenon in different kinds of networks [4-6,11-19].
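
The step from a power-law attachment rate to a cumulative curve with exponent α + 1 can be written out explicitly. The short derivation below is a sketch under stated assumptions: k is treated as a continuous variable, the integral is taken from 0, α > -1, and C' denotes the (hypothetical) constant prefactor of the normalized rate at a fixed time.

```latex
\begin{aligned}
T'(k) &= C'\,k^{\alpha}, \qquad \alpha > -1,\\
\kappa(k) &= \int_0^{k} C'\,k'^{\alpha}\,dk' = \frac{C'}{\alpha+1}\,k^{\alpha+1},\\
\log \kappa(k) &= (\alpha+1)\,\log k + \log\frac{C'}{\alpha+1}.
\end{aligned}
```

In a log-log plot the cumulative curve is therefore a straight line whose slope is α + 1, so a fitted slope of, say, 1.74 would correspond to α = 0.74.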

To test whether the preferential attachment of scientific research differs between
countries, we separate the entire data set according to the country of the first author's
affiliation and then compare the countries contributing the most papers
(USA, Germany, Japan, China). With this separated data set, we perform the
examination of preferential attachment counting only the papers from authors in country c:

$$T(k) = \sum_{k_\lambda(t) = k,\ \mathrm{aff}_i = c} a_{i\lambda}\, \frac{N(t)}{n_k(t)}. \tag{4}$$

Here aff_i = c means that the principal affiliation of paper i is in country c. In counting k
and n_k(t), we include papers from all countries, meaning that scientists from all
countries face the temptation of tracing the same overall hotness in the entire
PR-PACS data set. Besides countries, the above calculation can similarly be applied to
any feature of papers, such as the number of authors, affiliations or references.
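
The restriction to one country (or to any other paper feature) changes only which new links are counted, while field sizes and the number of fields of each size are still taken from the whole data set. A minimal sketch, assuming each new link carries a label such as the country of the first author's affiliation; the function name `estimate_T_for_group`, the labels and the toy data are hypothetical.

```python
from collections import Counter

def estimate_T_for_group(size_before, new_links, group):
    """Weighted histogram of field sizes, counting only links with the given label.

    size_before: dict field -> size at time t, built from ALL papers.
    new_links:   iterable of (field, label) pairs for links created in the window.
    group:       label to keep, e.g. a country code or an author-count bin.
    """
    n_fields = len(size_before)              # N(t), over all countries
    n_k = Counter(size_before.values())      # n_k(t), over all countries
    T = Counter()
    for field, label in new_links:
        k = size_before.get(field)
        if label != group or k is None:      # other groups and new fields are skipped
            continue
        T[k] += n_fields / n_k[k]
    return dict(T)

size_before = {"A": 1, "B": 1, "C": 4}
new_links = [("A", "US"), ("C", "CN"), ("C", "CN"), ("B", "US"), ("C", "US")]
T_cn = estimate_T_for_group(size_before, new_links, "CN")
T_us = estimate_T_for_group(size_before, new_links, "US")
```

Because every group is measured against the same field sizes, the resulting exponents can be compared directly across countries or across bins of authors, affiliations and references.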

Measuring relative contribution ratio R_c(k). The absolute contribution of a country
is measured simply by the percentage of published papers from that country out of the
total number of published papers,

$$r_c = \frac{\sum_k m_k^c}{\sum_k m_k}, \tag{5}$$

where $m_k$ ($m_k^c$) is the number of papers (from country c) belonging to fields with size
k. Here we present a more detailed breakdown of this absolute contribution by
looking, for each individual field, at the percentage of papers from that country out
of all papers in that field, and then normalizing by the absolute contribution of that
country,

$$R_c(k) = \frac{m_k^c}{m_k\, r_c}. \tag{6}$$

This is a static measure, so it is easy to compute. In a sense it also describes how often
scientists in that country pursue hot fields.
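
As an illustration of this static measure, the sketch below computes the share of a country within each field size and divides it by the country's overall share. It is a hedged example, not the authors' code: the function name `relative_contribution`, the record layout (one pair per paper-field link) and the toy data are assumptions, and the country is assumed to have at least one paper.

```python
from collections import Counter

def relative_contribution(records, country):
    """Return (overall share, {k: relative contribution ratio}) for one country.

    records: iterable of (k, country_label) pairs, one per paper-field link,
             where k is the size of the field the paper belongs to.
    """
    records = list(records)
    m_k = Counter(k for k, _ in records)                   # all papers, by field size
    m_k_c = Counter(k for k, c in records if c == country) # papers from the country
    r_c = sum(m_k_c.values()) / sum(m_k.values())          # overall (absolute) share
    # Share within fields of size k, normalized by the overall share.
    R = {k: (m_k_c[k] / m_k[k]) / r_c for k in m_k}
    return r_c, R

# Toy data: the country "CN" is under-represented at k = 1 and over-represented at k = 10.
records = [(1, "US"), (1, "US"), (1, "CN"), (10, "US"), (10, "CN"), (10, "CN")]
r_cn, R_cn = relative_contribution(records, "CN")
```

A ratio below 1 at small k together with a ratio above 1 at large k is the pattern the text interprets as a stronger tendency to work in hot fields.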